IBminer: A Text Mining Tool for Constructing and Populating InfoBox Databases and Knowledge Bases

Authors

  • Hamid Mousavi
  • Shi Gao
  • Carlo Zaniolo
Abstract

Knowledge bases and structured summaries play a crucial role in many applications, such as text summarization, question answering, essay grading, and semantic search. Although many systems (e.g., DBpedia and YaGo2) provide massive knowledge bases of such summaries, they all suffer from incompleteness, inconsistencies, and inaccuracies. These problems can be addressed, and the knowledge bases much improved, by combining and integrating different knowledge bases, but their very large sizes and their reliance on different terminologies and ontologies make the task very difficult. In this demo, we present a system that achieves good success on this task by: i) employing the interlinks available in current knowledge bases (e.g., externalLink and redirect links in DBpedia) to combine information on individual entities, and ii) using widely available text corpora (e.g., Wikipedia) and our IBminer text-mining system to generate and verify structured information and to reconcile terminologies across different knowledge bases. We will also demonstrate two tools designed to support the integration process in close collaboration with IBminer. The first is the InfoBox Knowledge-Base Browser (IBKB), which provides structured summaries and their provenance; the second is the InfoBox Editor (IBE), which suggests relevant attributes for a user-specified subject, so that the user can easily improve the knowledge base without any knowledge of the internal terminology of individual systems.
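
Step i) of the abstract, combining information on individual entities through redirect-style interlinks, can be illustrated with a minimal sketch. This is not IBminer's implementation: the function names, the triple layout, and the toy data are all assumptions made for illustration. It resolves each subject name to a canonical entity through a redirect table and merges facts from several sources while recording which source asserted each fact.

```python
from collections import defaultdict


def canonical(name, redirects):
    """Follow redirect links until reaching a name with no further redirect.

    The `seen` set stops the walk if the redirect table contains a cycle.
    """
    seen = set()
    while name in redirects and name not in seen:
        seen.add(name)
        name = redirects[name]
    return name


def merge_knowledge_bases(sources, redirects):
    """Merge (subject, attribute, value) triples from several sources.

    Returns {subject: {(attribute, value): set_of_source_names}}, so every
    merged fact keeps a simple provenance record.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for source_name, triples in sources.items():
        for subject, attribute, value in triples:
            entity = canonical(subject, redirects)
            merged[entity][(attribute, value)].add(source_name)
    return merged


# Toy data, invented for illustration only (hypothetical entity and sources).
redirects = {"Acme_Univ": "Acme_University"}
sources = {
    "kb_a": [("Acme_Univ", "city", "Springfield")],
    "kb_b": [
        ("Acme_University", "city", "Springfield"),
        ("Acme_University", "type", "public"),
    ],
}
merged = merge_knowledge_bases(sources, redirects)
```

Facts asserted by more than one source end up with several names in their provenance set, which is one simple signal a verification step could use. Reconciling attribute names across terminologies, which the abstract assigns to text mining, is outside this sketch.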


Similar resources

Presenting a Model for Extracting Information from Text Documents, Based on Text Mining in the E-Learning Domain

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts, or new hypotheses by automatically extracting information from different written documents. T...


Creating a Knowledge Base of Biological Research Papers

Intelligent text-oriented tools for representing and searching the biological research literature are being developed, which combine object-oriented databases with artificial intelligence techniques to create a richly structured knowledge base of Materials and Methods sections of biological research papers. A knowledge model of experimental processes, biological and chemical substances, and an...


Creating a Knowledge Base of Biological Research Papers

Intelligent text-oriented tools for representing and searching the biological research literature are being developed, which combine object-oriented databases with artificial intelligence techniques to create a richly structured knowledge base of Materials and Methods sections of biological research papers. A knowledge model of experimental processes, biological and chemical substances, and ana...


Populating Knowledge Based Decision Support Systems

Knowledge-based decision support systems (KBDSS) support business and organizational decision-making activities on the basis of the knowledge available about the domain in question. One of the main problems with knowledge bases is that their construction is a time-consuming task. A number of methodologies have been proposed in the context of the Semantic Web to assist in the development ...


Text-mining tools for optimizing community database curation workflows in neuroscience

The emphasis on multilevel modeling techniques in the neurosciences has led to an increased need for large-scale databases containing neuroscientific data. Despite this, such databases are not being populated at a rate commensurate with the demand for them among computational neuroscientists. The reason for this is common to scientific database curation in general: limitation of resources. Much of ...



Journal:
  • PVLDB

Volume 6  Issue 

Pages  -

Publication date 2013